29 research outputs found

    New Grapheme Generation Rules for Two-Stage Model-based Grapheme-to-Phoneme Conversion

    The precise conversion of arbitrary text into its corresponding phoneme sequence (grapheme-to-phoneme or G2P conversion) is used in speech synthesis and recognition, pronunciation learning software, spoken term detection and spoken document retrieval systems. Because the quality of this module plays an important role in the performance of such systems and many problems regarding G2P conversion have been reported, we propose a novel two-stage model-based approach, which is implemented using an existing weighted finite-state transducer-based G2P conversion framework, to improve the performance of the G2P conversion model. The first-stage model is built for automatic conversion of words to phonemes, while the second-stage model utilizes the input graphemes and output phonemes obtained from the first stage to determine the best final output phoneme sequence. Additionally, we designed new grapheme generation rules, which enable extra detail for the vowel and consonant graphemes appearing within a word. When compared with previous approaches, the evaluation results indicate that our approach using rules focusing on the vowel graphemes slightly improved the accuracy on the out-of-vocabulary dataset and consistently increased the accuracy on the in-vocabulary dataset.

    Segmental Duration Control Based on an Articulatory Model

    This paper proposes a new method that determines segmental duration for text-to-speech conversion based on the movement of articulatory organs which compose an articulatory model. The articulatory model comprises four time-variable articulatory parameters representing the conditions of articulatory organs whose physical restriction seems to significantly influence the segmental duration. The parameters are controlled according to an input sequence of phonetic symbols, following which segmental duration is determined based on the variation of the articulatory parameters. The proposed method is evaluated through an experiment using a Japanese speech database that consists of 150 phonetically balanced sentences. The results indicate that the mean square error of predicted segmental duration is approximately 15 ms for the closed set and 15-17 ms for the open set. The error is within 20 ms, the level of acceptability for distortion of segmental duration without loss of naturalness, and hence the method is proved to effectively predict segmental duration.

    Open-source Software for Developing Anthropomorphic Spoken Dialog Agents

    An architecture for a highly interactive, human-like spoken-dialog agent is discussed in this paper. To make it easy to integrate modules with different characteristics, including a speech recognizer, speech synthesizer, facial-image synthesizer and dialog controller, each module is modeled as a virtual machine that has a simple common interface, and the modules are connected to each other through a broker (communication manager). The agent system under development is supported by the IPA and it will be publicly available as a software toolkit this year.

    A Model of Belief Formation Based on Causality and Application to N-armed Bandit Problem

    Regional variation in survival following pediatric out-of-hospital cardiac arrest

    Background: Although regional variation in outcome after adult out-of-hospital cardiac arrest (OHCA) is known, no clinical studies have assessed this in pediatric OHCA. Methods and Results: This nationwide, prospective, population-based observation of the whole of Japan included consecutive OHCA patients with a resuscitation attempt from January 2005 through December 2009. Primary outcome was 1-month survival with neurologically favorable outcome. Japan was divided into the following 7 regions as the largest administrative units: Hokkaido-Tohoku, Kanto, Tokai-Hokuriku, Kinki, Chugoku, Shikoku, and Kyushu-Okinawa. The outcome of pediatric OHCA was then compared between the regions. Multiple logistic regression analysis was used to adjust for other factors that were considered to influence the relationship between region and outcome. A total of 8,240 pediatric OHCA patients were registered during the study period. One-month survival with neurologically favorable outcome significantly differed by region: 2.5% (24/967) in Hokkaido-Tohoku (adjusted odds ratio [AOR], 1.65; 95% confidence interval [CI]: 0.94-2.90), 2.9% (47/1614) in Tokai-Hokuriku (AOR, 2.06; 95% CI: 1.28-3.31), 2.1% (26/1239) in Kinki (AOR, 1.45; 95% CI: 0.84-2.51), 3.4% (16/465) in Chugoku (AOR, 3.11; 95% CI: 1.62-6.00), 1.5% (4/259) in Shikoku (AOR, 0.79; 95% CI: 0.26-2.43), and 2.8% (27/974) in Kyushu-Okinawa (AOR, 2.15; 95% CI: 1.24-3.74), compared with Kanto as the reference (1.4%, 37/2722). Conclusions: According to Japanese nationwide OHCA registry data, there are significant regional variations in the outcome of pediatric OHCA.

    Yoshio Okamoto, Taku Iwami, Tetsuhisa Kitamura, Masahiko Nitta, Atsushi Hiraide, Tsuneo Morishima, Takashi Kawamura, Regional Variation in Survival Following Pediatric Out-of-Hospital Cardiac Arrest, Circulation Journal, 2013, Volume 77, Issue 10, Pages 2596-2603, Released September 25, 2013, [Advance publication] Released July 04, 2013, Online ISSN 1347-4820, Print ISSN 1346-9843, https://doi.org/10.1253/circj.CJ-12-1604, https://www.jstage.jst.go.jp/article/circj/77/10/77_CJ-12-1604/_article/-char/e
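
    The regional percentages quoted in this abstract can be re-derived from the raw counts it reports. The following is a minimal sketch, not the authors' analysis: the names `regions`, `crude_percent` and `crude_odds_ratio` are illustrative assumptions, and the odds ratios it prints are crude (unadjusted) values, so they are expected to differ from the adjusted odds ratios in the abstract, which come from multiple logistic regression.

```python
# Counts transcribed from the abstract: (favorable 1-month outcomes, patients).
regions = {
    "Hokkaido-Tohoku": (24, 967),
    "Kanto": (37, 2722),          # reference region in the abstract
    "Tokai-Hokuriku": (47, 1614),
    "Kinki": (26, 1239),
    "Chugoku": (16, 465),
    "Shikoku": (4, 259),
    "Kyushu-Okinawa": (27, 974),
}


def crude_percent(events, total):
    """Unadjusted proportion of favorable outcomes, in percent."""
    return 100.0 * events / total


def crude_odds_ratio(events, total, ref_events, ref_total):
    """Unadjusted odds ratio of a region against the reference region."""
    odds = events / (total - events)
    ref_odds = ref_events / (ref_total - ref_events)
    return odds / ref_odds


# The regional denominators should add up to the registered cohort size.
total_patients = sum(total for _, total in regions.values())

ref_events, ref_total = regions["Kanto"]
for name, (events, total) in regions.items():
    pct = crude_percent(events, total)
    crude_or = crude_odds_ratio(events, total, ref_events, ref_total)
    print(f"{name:16s} {pct:4.1f}%  crude OR vs Kanto: {crude_or:.2f}")
print("total patients:", total_patients)
```

    The regional denominators sum to the 8,240 registered patients, and each crude percentage rounds to the figure given in the abstract.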

    Development of a Toolkit for Spoken Dialog Systems with an Anthropomorphic Agent: Galatea

    The Interactive Speech Technology Consortium (ISTC) has been developing a toolkit called Galatea that comprises four fundamental modules for speech recognition, speech synthesis, face synthesis, and dialog control, that can be used to realize an interface for spoken dialog systems with an anthropomorphic agent. This paper describes the development of the Galatea toolkit and the functions of each module; in addition, it discusses the standardization of the description of multi-modal interactions.

    APSIPA ASC 2009: Asia-Pacific Signal and Information Processing Association, 2009 Annual Summit and Conference. 4-7 October 2009. Sapporo, Japan. Oral session: Infrastructure Software for Speech Processing (5 October 2009)

    Reduction of Phosphatic and Potash Fertilizer in Sweet Corn Production by Pre-transplanting Application of Potassium Phosphate to Plug Seedlings

    To develop a new fertilizing system with a reduced amount of phosphatic fertilizer in sweet corn production, we applied potassium phosphate to the plug seedlings before transplanting to the field, and examined its effects on growth, yield, photosynthetic activity and absorption of minerals. The amount of phosphatic and potash fertilizers necessary to grow sweet corn could be reduced by the pre-transplanting KP application (PTKPA) to the plug seedlings. We considered the mechanisms involved in the reduction of P and K application rate by PTKPA to be as follows: 1) PTKPA increased the P content of the plant, which accelerated root establishment. 2) The advanced root establishment not only reduced the duration of water stress, but also increased absorption of essential nutrients such as N and Mg. 3) Higher content of N and Mg led to higher chlorophyll content and possibly protein content, which activated photosynthesis during the early growth stage. 4) Improved photosynthetic activities increased NAR during the early growth stage. 5) This increase in NAR accelerated leaf expansion, increasing LAI. 6) Larger LAI during the early growth stage led to larger LAI throughout the growing stage, resulting in a higher CGR and ear yield.